International Journal devoted to Concept Theory, Classification, Indexing and Knowledge Representation: Contents

Authors

  • Elaine Ménard
  • Nouf Khashman
  • Svetlana Kochkina
  • Juan-Manuel Torres-Moreno
  • Patricia Velazquez-Morales
  • Fen Zhou
  • Pierre Jourlin
  • Priyanka Rawat
  • Peter Peinl
Abstract

Knowl. Org. 43 (2016) No. 1. KO Knowledge Organization, Official Bi-Monthly Journal of the International Society for Knowledge Organization. ISSN 0943-7444.

The purpose of this work is to develop an ontology-based framework for an information retrieval system that caters to specific user queries. To create such an ontology, information was obtained from a wide range of sources involved with brain tumour study and research. The information was compiled and analysed to provide a standard, reliable and relevant information base for the proposed system. Facet-based methodology has been used for ontology formalization for quite some time. Ontology formalization involves steps such as identification of the terminology, analysis, synthesis, standardization and ordering. The vast majority of ontologies being developed today lack flexibility, which becomes a formidable constraint for interoperability. We found that a facet-based method provides a distinct guideline for developing a robust and flexible model of the brain tumour domain. Our attempt has been to bridge library and information science and computer science, which itself involved an experimental approach. We found that a faceted approach is enduring, as it supports properties such as navigation, exploration and faceted browsing. A computer-based brain tumour ontology supports researchers gathering information on brain tumour research and allows users across the world to access new scientific information quickly and efficiently.

Konkova, Elena, MacFarlane, Andrew, and Göker, Ayşe. “Analysing Creative Image Search Information Needs.” Knowledge Organization 43 no. 1: 13-21. 23 references.

Abstract: Creative professionals in advertising, marketing, design and journalism search for images to visually represent a concept for their project. The main purpose of this paper is to present search facets derived from an analysis of documents known as briefs, which are widely used in creative industries as requirement documents describing information needs. The briefs specify the type of image required, such as the content and context of use for the image, and represent the topic from which the searcher builds an image query. We take three main sources (user image search behaviour, briefs, and image search engine search facets) to examine the search facets for image searching and to address the following research question: are search facet schemes for image search engines sufficient for user needs, or is revision needed? We found that there are three main classes of user search facet, which include business, contextual and image-related information. The key argument in the paper is that the facet “keyword/tag” is ambiguous and does not support user needs for more generic descriptions to broaden their search or more specific descriptions to narrow it; we suggest that a more detailed search facet scheme would be appropriate.

Ménard, Elaine, Khashman, Nouf, Kochkina, Svetlana, Torres-Moreno, Juan-Manuel, Velazquez-Morales, Patricia, Zhou, Fen, Jourlin, Pierre, Rawat, Priyanka, Peinl, Peter, Linhares Pontes, Elvys, and Brunetti, Ilaria. “A Second Life for TIIARA: From Bilingual to Multilingual!” Knowledge Organization 43 no. 1: 22-34. 39 references.

Abstract: Multilingual controlled vocabularies are rare and often very limited in the choice of languages offered. TIIARA (Taxonomy for Image Indexing and RetrievAl) is a bilingual taxonomy developed for image indexing and retrieval. This controlled vocabulary offers indexers and image searchers innovative and coherent access points for ordinary images. The preliminary steps of the elaboration of the bilingual structure are presented. For its initial development, TIIARA included only two languages, French and English. As a logical follow-up, TIIARA was translated into eight languages (Arabic, Spanish, Brazilian Portuguese, Mandarin Chinese, Italian, German, Hindi and Russian) in order to increase its international scope. This paper briefly describes the different stages of the development of the bilingual structure. The processes used in the translations are subsequently presented, as well as the main difficulties encountered by the translators. Adding more languages to TIIARA constitutes an added value for a controlled vocabulary meant to be used by image searchers, who are often limited by their lack of knowledge of multiple languages.

Vaidya, Praveenkumar and Harinarayana, N. S. “The Comparative and Analytical Study of LibraryThing Tags with Library of Congress Subject Headings.” Knowledge Organization 43 no. 1: 35-43. 30 references.

Abstract: The internet in its Web 2.0 version has given users the opportunity to participate and the chance to enhance the existing system, which makes it dynamic and collaborative. The activity of social tagging among researchers to organize digital resources is an interesting subject of study for information professionals. Organizing resources for future retrieval through these user-generated terms invites an analysis that compares them with professionally created controlled vocabularies. In this study, an attempt has been made to compare Library of Congress Subject Headings (LCSH) terms with LibraryThing social tags. The results of this comparative analysis show that social tags can be used to enhance the metadata for information retrieval. Still, the uncontrolled nature of social tags is a concern and creates uncertainty among researchers.

Ridenour, Laura. “Boundary Objects: Measuring Gaps and Overlap Between Research Areas.” Knowledge Organization 43 no. 1: 44-55. 35 references.

Abstract: The aim of this paper is to develop a methodology to determine conceptual overlap between research areas. It investigates patterns of terminology usage in scientific abstracts as boundary objects between research specialties. Research specialties were determined by high-level classifications assigned by Thomson Reuters in their Essential Science Indicators file, which provided a strictly hierarchical classification of journals into 22 categories. Results from the query “network theory” were downloaded from the Web of Science. From this file, two top-level groups, economics and social sciences, were selected and topically analyzed to provide a baseline of similarity on which to run an informetric analysis. The Places & Spaces Map of Science (Klavans and Boyack 2007) was used to determine the proximity of disciplines to one another in order to select the two disciplines used in the analysis. The groups analyzed share common theories and goals; however, they used different language to describe their research. It was found that 61% of term words were shared between the two groups.

AlQenaei, Zainab M. and Monarchi, David E. “The Use of Learning Techniques to Analyze the Results of a Manual Classification System.” Knowledge Organization 43 no. 1: 56-63. 19 references.

Abstract: Classification is the process of assigning objects to pre-defined classes based on observations or characteristics of those objects, and there are many approaches to performing this task. The overall objective of this study is to demonstrate the use of two learning techniques to analyze the results of a manual classification system. Our sample consisted of 1,026 documents from the ACM Computing Classification System, classified by their authors as belonging to one of the groups of the classification system: “H.3 Information Storage and Retrieval.” A singular value decomposition of the documents’ weighted term-frequency matrix was used to represent each document in a 50-dimensional vector space. The analysis of the representation using both supervised (decision tree) and unsupervised (clustering) techniques suggests that two pairs of the ACM classes are closely related to each other in the vector space. Class 1 (Content Analysis and Indexing) is closely related to Class 3 (Information Search and Retrieval), and Class 4 (Systems and Software) is closely related to Class 5 (Online Information Services). Further analysis was performed to test the diffusion of the words in the two classes using both cosine and Euclidean distance.
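The document representation described in the AlQenaei and Monarchi abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which term weighting was used, so TF-IDF is assumed here, and the toy corpus, the function names, and the choice of k = 2 (instead of the 50 dimensions reported) are all hypothetical and chosen only to keep the example small.

```python
import numpy as np

def weighted_term_matrix(docs):
    """Build a document-term matrix with TF-IDF weights (assumed weighting)."""
    vocab = sorted({term for doc in docs for term in doc})
    column = {term: j for j, term in enumerate(vocab)}
    tf = np.zeros((len(docs), len(vocab)))
    for i, doc in enumerate(docs):
        for term in doc:
            tf[i, column[term]] += 1.0
    doc_freq = np.count_nonzero(tf, axis=0)      # documents containing each term
    idf = np.log(len(docs) / doc_freq) + 1.0     # smoothed inverse document frequency
    return tf * idf

def svd_embed(matrix, k):
    """Represent each document by its coordinates on the top-k singular directions."""
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    k = min(k, len(s))
    return u[:, :k] * s[:k]

def cosine_distance(a, b):
    return 1.0 - float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def euclidean_distance(a, b):
    return float(np.linalg.norm(a - b))

# Hypothetical toy corpus: two documents about indexing, two about online services.
docs = [
    "indexing content analysis indexing terms".split(),
    "content analysis indexing thesaurus".split(),
    "online services network commerce".split(),
    "online information services network".split(),
]
vectors = svd_embed(weighted_term_matrix(docs), k=2)

print("same topic  :", cosine_distance(vectors[0], vectors[1]),
      euclidean_distance(vectors[0], vectors[1]))
print("cross topic :", cosine_distance(vectors[0], vectors[2]),
      euclidean_distance(vectors[0], vectors[2]))
```

In this toy setting, documents on the same topic end up close together under both distance measures, while documents from different topics are far apart. This is the kind of relationship between classes that the abstract reports examining in the reduced vector space.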


Related resources

Image Classification via Sparse Representation and Subspace Alignment

Image representation is a crucial problem in image processing, where there exist many low-level representations of images, e.g., SIFT, HOG and so on. But there is a missing link between low-level and high-level semantic representations. In fact, traditional machine learning approaches, e.g., non-negative matrix factorization, sparse representation and principal component analysis are employed to d...


Explaining the Methods of Architecture Representation Using Semiotic Analysis (Umberto Eco's Theory of Architecture Codes)

This paper attempts to explain the concept of representation and architectural representation through a qualitative methodology, to approach its procedure for gradual creation in architecture, and then, following scholars, to present the concepts in order to establish the effect of this concept in the process of architectural facts. In addition, it refers to theories and practical texts b...


Document Clustering Based on Ontology and a Fuzzy Approach

Data mining, also known as knowledge discovery in databases, is the process of discovering unknown knowledge from a large amount of data. Text mining applies data mining techniques to extract knowledge from unstructured text. Text clustering is one of the important techniques of text mining; it is the unsupervised classification of similar documents into different groups. The most important step...


Funds of Knowledge: An Underrated Tool for School Literacy and Student Engagement

The chief aim of this paper is to explore the concept of Funds of Knowledge (FOK) in relation to Cultural Historical Activity Theory (CHAT). This study unveils the basic tenets of FOK through the lens of activity theory and analyzes pertinent discoveries, key concepts, and scholars’ arguments relating to FOK and literacy development over time. The major purpose of this study is to expand the pers...


Semantic Indexing Approach of a Corpora Based On Ontology

The growth over centuries in the volume of text data, such as books and articles in libraries, has made it necessary to establish effective mechanisms to locate them. Early techniques such as abstraction, indexing and the use of classification categories marked the birth of a new field of research called "Information Retrieval". Information Retrieval (IR) can be defined as the task of defining models a...


Deep Unsupervised Domain Adaptation for Image Classification via Low Rank Representation Learning

Domain adaptation is a powerful technique given a large amount of labeled data with similar attributes in different domains. In real-world applications, there is a huge amount of data, but most of it is unlabeled. Domain adaptation is effective in image classification, where it is expensive and time-consuming to obtain adequate labeled data. We propose a novel method named DALRRL, which consists of deep ...



Publication date: 2016